Variational Inference with Stein Mixtures
Author
Abstract
Obtaining uncertainty estimates is increasingly important in modern machine learning, especially as models are given an increasing amount of responsibility. Yet, as the tasks undertaken by automation become more complex, so do the models and accompanying inference strategies. In fact, exact inference is often impossible in practice for modern probabilistic models. Thus, performing variational inference (VI) [12] accurately and at scale has become essential. As posteriors are often complicated—exhibiting multiple modes, skew, correlations, symmetries—recent VI research has focused on using either implicit [20, 23, 18] or mixture approximations [10, 1, 7, 19, 9]. The former makes no strong parametric assumption, using either samples from a black-box function [20] or particles [16, 22] to represent the posterior, and the latter uses a convex combination of parametric forms as the variational distribution. While, in theory, these methods are expressive enough to capture any posterior, practical issues can prevent them from being a good approximation. For instance, the discrete approximation that makes implicit methods so flexible does not scale with dimensionality, as an exponential number of samples/particles are needed to cover the posterior. Mixtures, on the other hand, can scale well, but their optimization objective is problematic and requires using a lower bound to the evidence lower bound (ELBO) (i.e. a lower bound of a lower bound on the marginal likelihood) [10, 7, 21].
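To make the last point concrete, the following is a generic sketch of the Jensen-type construction commonly used to obtain such objectives for mixture variational families; the weights $w_k$, components $q_k$, and the Gaussian closed form are illustrative assumptions, not necessarily the exact bound used in [10, 7, 21]. For a mixture $q(z) = \sum_k w_k q_k(z)$, the entropy term of the ELBO contains a log of a sum and has no closed form:
\[
\log p(x) \;\ge\; \mathrm{ELBO}(q) \;=\; \mathbb{E}_{q}[\log p(x, z)] + H(q),
\qquad
H(q) \;=\; -\sum_k w_k\, \mathbb{E}_{q_k}\!\Big[\log \sum\nolimits_j w_j\, q_j(z)\Big].
\]
Because $-\log$ is convex, Jensen's inequality applied inside each component expectation gives a tractable lower bound on the entropy,
\[
H(q) \;\ge\; -\sum_k w_k \log \sum_j w_j\, \mathbb{E}_{q_k}\big[q_j(z)\big],
\]
and for Gaussian components $\mathbb{E}_{q_k}[q_j(z)] = \mathcal{N}(\mu_k;\, \mu_j,\, \Sigma_k + \Sigma_j)$ is available in closed form. Substituting this bound into the ELBO is what produces a lower bound of a lower bound on the marginal likelihood: the quantity actually optimized can sit strictly below the ELBO even when the mixture family itself is expressive enough to match the posterior.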
Similar works
Variational inference for Dirichlet process mixtures
Dirichlet process (DP) mixture models are the cornerstone of nonparametric Bayesian statistics, and the development of Monte-Carlo Markov chain (MCMC) sampling methods for DP mixtures has enabled the application of nonparametric Bayesian methods to a variety of practical data analysis problems. However, MCMC sampling can be prohibitively slow, and it is important to explore alternatives. One cl...
Operator Variational Inference
Variational inference is an umbrella term for algorithms which cast Bayesian inference as optimization. Classically, variational inference uses the Kullback-Leibler divergence to define the optimization. Though this divergence has been widely used, the resultant posterior approximation can suffer from undesirable statistical properties. To address this, we reexamine variational inference from i...
Variational Inference for Policy Gradient
Inspired by the seminal work on Stein Variational Inference [2] and Stein Variational Policy Gradient [3], we derived a method to generate samples from the posterior variational parameter distribution by explicitly minimizing the KL divergence to match the target distribution in an amortized fashion. Consequently, we applied this variational inference technique to vanilla policy gradient, TRPO ...
Lifted Relational Variational Inference
Hybrid continuous-discrete models naturally represent many real-world applications in robotics, finance, and environmental engineering. Inference with large-scale models is challenging because relational structures deteriorate rapidly during inference with observations. The main contribution of this paper is an efficient relational variational inference algorithm that factors large-scale probabi...
Stein Variational Gradient Descent: Theory and Applications
Although optimization can be done very efficiently using gradient-based optimization these days, Bayesian inference or probabilistic sampling has been considered to be much more difficult. Stein variational gradient descent (SVGD) is a new particle-based inference method derived using a functional gradient descent for minimizing KL divergence without explicit parametric assumptions. SVGD can be...
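The last entry above summarizes Stein variational gradient descent (SVGD), the particle-based update behind the scalability concern raised in the abstract. Below is a minimal, self-contained sketch of that update in NumPy; the RBF kernel with a median-heuristic bandwidth, the step size, and the two-dimensional Gaussian target are illustrative assumptions rather than details taken from this page.

```python
# A minimal SVGD sketch: particles are moved along a kernelized functional
# gradient of the KL divergence to the target p(z). Kernel, bandwidth rule,
# step size, and target are illustrative choices, not taken from the paper.
import numpy as np

def rbf_kernel(x):
    """RBF kernel matrix and summed kernel gradients, bandwidth via median heuristic."""
    diffs = x[:, None, :] - x[None, :, :]                      # (n, n, d), x_j - x_i
    sq_dists = np.sum(diffs ** 2, axis=-1)                     # (n, n)
    h = np.median(sq_dists) / np.log(len(x) + 1.0) + 1e-8      # heuristic bandwidth
    k = np.exp(-sq_dists / h)                                  # k[j, i] = k(x_j, x_i)
    # sum_j grad_{x_j} k(x_j, x_i) = -(2/h) * sum_j (x_j - x_i) k(x_j, x_i)
    grad_k = -(2.0 / h) * (k[:, :, None] * diffs).sum(axis=0)  # (n, d)
    return k, grad_k

def svgd_step(x, grad_log_p, step_size=0.1):
    """One update: x_i <- x_i + eps * (1/n) sum_j [k(x_j,x_i) grad log p(x_j) + grad_{x_j} k(x_j,x_i)]."""
    k, grad_k = rbf_kernel(x)
    phi = (k.T @ grad_log_p(x) + grad_k) / len(x)
    return x + step_size * phi

# Hypothetical toy target: a 2-D Gaussian with mean mu and covariance cov.
mu = np.array([1.0, -1.0])
cov_inv = np.linalg.inv(np.array([[1.0, 0.6], [0.6, 1.0]]))
grad_log_p = lambda x: -(x - mu) @ cov_inv                     # grad of log N(x; mu, cov)

rng = np.random.default_rng(0)
particles = rng.normal(size=(100, 2))                          # initial particles
for _ in range(500):
    particles = svgd_step(particles, grad_log_p)
print(particles.mean(axis=0))                                  # should approach mu
```

The kernel-gradient term acts as a repulsive force that keeps the particles spread over the target; it is precisely this discrete, particle-based representation of the posterior that demands exponentially many particles as the dimensionality grows, as noted in the abstract.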